Regular expressions used to obfuscate JavaScript code

Just yesterday I had the opportunity to look at a kind of obfuscated JavaScript code I had never seen before. The script contains a class named KyD, defined using the prototype pattern. The code looks something like this:
function KyD() {};

KyD.prototype = {
    install : function()
    {
        ...
    },
    cookieName:'feadcbhg',
    getFrameURL : function()
    {
        ...
    },
    ...
};

var o44o=new KyD(); o44o.install();
More or less a standard class declaration. The constructor is empty because no special initialization is needed. Right after the class definition come two more statements: a new KyD object is created and its “install” method is called.
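For readers less familiar with the prototype pattern, here is a minimal runnable sketch of the same structure. The class name `Widget`, the `cookieName` value and the body of `install` are invented for illustration; they are not taken from the malicious script, whose method bodies are elided above.

```javascript
// Hypothetical stand-in for the KyD class; names and bodies are made up.
function Widget() {}                // empty constructor: nothing to initialize

Widget.prototype = {
    cookieName: 'demo',             // data property shared by every instance
    install: function () {
        // Methods reach sibling members through `this`.
        return 'installed with cookie ' + this.cookieName;
    }
};

var widget = new Widget();          // create the object...
var installResult = widget.install(); // ...and call the method, as the script does
console.log(installResult);         // installed with cookie demo
```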

For me it’s quite uncommon to see a class declaration inside a malicious script; I’m used to seeing JavaScript code written in a procedural style. This is not a problem in itself, of course. The problem arises when looking at the declared methods. It’s often easy to understand a JavaScript function from its source code, but not this time. Look at this snippet taken from one of the methods declared inside the KyD class:

Are you able to tell me the content of “o” in a few seconds? Even if you know how to handle s, you’ll need more than a few seconds to solve the puzzle.
How do you work out the real meaning of the string? The script has been obfuscated using regular expressions; nothing impossible, but to identify the content of the string s you need to know something about regexps.

How can regexps be used to obfuscate a string?
The string s is composed of three parts: two of them are obfuscated substrings, while the third is obtained from getFrameURL, another method of the KyD class.
Each substring has the replace method applied to it; in this case the method takes a regular expression and uses it to search for and replace characters in the string. In general, replace substitutes one part of a string with another:

stringObject.replace(findstring,newstring)

Here is how to use the method:
var s = "Say Hello"; document.write(s.replace(/Hello/, 'Ciao'));
The output is “Say Ciao”. Pretty easy. The regular expression can also carry flags, for example:
– i: performs a case-insensitive search
– g: performs a global search, replacing every match in the string rather than only the first.
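The effect of the two flags is easiest to see side by side. The following is a small sketch; the sample sentence and variable names are my own and have nothing to do with the analyzed script.

```javascript
var greeting = 'Hello hello HELLO';

// No flags: only the first case-sensitive match is replaced.
var noFlags = greeting.replace(/hello/, 'ciao');         // 'Hello ciao HELLO'

// i: case-insensitive, but still only the first match.
var ignoreCase = greeting.replace(/hello/i, 'ciao');     // 'ciao hello HELLO'

// g: every case-sensitive match.
var allMatches = greeting.replace(/hello/g, 'ciao');     // 'Hello ciao HELLO'

// gi: every match, regardless of case.
var allIgnoreCase = greeting.replace(/hello/gi, 'ciao'); // 'ciao ciao ciao'

console.log(noFlags, '|', ignoreCase, '|', allMatches, '|', allIgnoreCase);
```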

Back to our snippet. Looking at the first substring you’ll see that the replace method is used in this way:

replace(/[%\)@QI]/g, '')

The g flag is present and the new string is the empty string (not NULL), which means that the matched parts of the string will be cut away. Which parts will be removed? The pattern is a character class: any single character listed between the square brackets (‘[’ and ‘]’), here %, ), @, Q and I, is replaced with the empty string. Removing those characters from the substring gives the de-obfuscated substring:
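To see the effect in isolation, here is a minimal sketch that applies the same replace call to a made-up string. The value of `obfuscated` is an assumption chosen for illustration; it is not the substring from the script.

```javascript
// Made-up obfuscated string: the junk characters % ) @ Q I are scattered through a URL.
var obfuscated = 'h%t)t@pQ:I/%/e)x@aQmIp%l)e@.QcIo%m';

// The character class matches any single listed character; the g flag applies
// the replacement to every match, and '' deletes the matched character.
var decoded = obfuscated.replace(/[%\)@QI]/g, '');

console.log(decoded); // http://example.com
```

The trick only works if the junk characters never occur in the real text, which is why the class contains symbols and uppercase letters that the hidden string does not use.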

Now I can decode all the strings and recover the original script!
Quite a nice trick. It forces you to spend some more time on a script, nothing more. Thanks to Bobby for the script.
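Decoding by hand is fine for a few strings; for a longer script the job could be automated. The helper below is a hypothetical sketch, not the way the strings were decoded for this post. It assumes every obfuscated literal has the exact shape 'literal'.replace(/[chars]/g, '') and contains no quotes or backslash escapes. It rewrites the source as text, so the malicious code is never executed.

```javascript
// Rewrites each  'literal'.replace(/[chars]/g, '')  call found in `source`
// into the plain string literal it would evaluate to.
// Assumption: literals contain no quotes and no backslash escapes.
function resolveReplaceCalls(source) {
    // Group 1: opening quote, group 2: literal body,
    // group 3: character class body, group 4: quote of the empty replacement.
    var pattern = /(['"])([^'"\\]*)\1\.replace\(\/\[((?:\\.|[^\]\\])+)\]\/g,\s*(['"])\4\)/g;
    return source.replace(pattern, function (match, quote, body, cls) {
        // Rebuild the same character class and strip the junk characters.
        var cleaned = body.replace(new RegExp('[' + cls + ']', 'g'), '');
        return quote + cleaned + quote;
    });
}

// Made-up input line imitating the style of the obfuscated script.
var sampleLine = "var s = 'h%t)tQp'.replace(/[%\\)@QI]/g, '') + this.getFrameURL();";
var resolvedLine = resolveReplaceCalls(sampleLine);
console.log(resolvedLine); // var s = 'http' + this.getFrameURL();
```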

3 comments on “Regular expressions used to obfuscate JavaScript code”

Ian on May 20, 2008 at 6:24 am said:

Good One …

BTW Another Complete Example at
http://www.forum.dklab.ru/js/other/ObnarugilStranniyJsKodNaSvoihStranitsah.html

zairon on May 20, 2008 at 11:12 am said:

Thank you Ian, there are always precious information on Russian sites… too bad I can’t read cyrillic.

headfirst on May 28, 2008 at 12:32 pm said:

headfirst says : I absolutely agree with this !

My infected computer

something strange happens inside it